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Abstract — Considerable  recent  experimental  evidence  sug¬ 
gests  that  significant  stochastic  fluctuations  are  present  in  gene 
regulatory  networks.  The  investigation  of  stochastic  properties 
in  genetic  systems  involves  the  formulation  of  a  mathematical 
representation  of  molecular  noise  and  devising  efficient  compu¬ 
tational  algorithms  for  computing  the  relevant  statistics  of  the 
modeled  processes.  However,  the  complexity  of  gene  regulatory 
networks  poses  serious  computational  difficulties  and  makes 
any  quantitative  prediction  a  difficult  task.  Monte  Carlo  based 
approaches  are  typically  used  in  study  of  complex  stochastic 
systems,  but  they  often  suffer  from  long  computation  times, 
slow  convergence,  and  offer  little  analytic  insight.  The  recently 
proposed  Finite  State  Projection  (FSP)  approach  provides  an 
analytical  alternative  that  avoids  many  of  the  shortcomings 
of  Monte  Carlo  methods,  but  thus  far  it  has  only  been 
demonstrated  for  a  certain  class  of  problems.  In  this  paper 
we  show  that  the  applicability  of  the  finite  projection  approach 
can  be  enhanced  by  taking  advantage  of  tools  from  the  fields  of 
modern  control  theory  and  dynamical  systems.  In  particular,  we 
present  an  approach  that  utilizes  singular  perturbation  theory 
in  conjunction  with  the  Finite  State  Projection  approach  to 
improve  the  computation  time  and  facilitate  model  reduction. 
We  demonstrate  the  effectiveness  of  the  resulting  slow  manifold 
FSP  algorithm  on  a  simple  example  arising  in  the  cellular  heat 
shock  response  mechanism. 

1.  Introduction 

Through  evolution  living  organisms  have  developed  com¬ 
plex  robust  control  mechanisms  with  which  they  can  regulate 
intracellular  processes  and  adapt  to  changing  environments. 
These  mechanisms  are  triggered  by  small  changes  in  sen¬ 
sitive  control  parameters,  yet  their  functions  are  unaffected 
by  relatively  large  variations  in  the  system  that  may  even 
include  damage  of  whole  parts  of  the  mechanism  itself. 
Understanding  control  processes  that  take  place  inside  a  cell 
is  key  to  understanding  fundamental  biological  functions  of 
living  organisms.  However,  despite  dramatic  new  develop¬ 
ments  in  experimental  study  of  intracellular  processes,  the 
quantitative  study  of  these  systems  is  significantly  hindered 
by  computational  complexity. 

For  many  chemical  processes  reactants  are  present  in  large 
numbers  and  the  reactions  can  be  modeled  at  the  macroscopic 
level  ordinary  or  stochastic  differential  equations  (ODEs 
or  SDEs).  Inside  a  cell,  however,  matters  can  be  entirely 
different.  Here  some  chemicals  such  as  proteins  or  RNA 
molecules  may  be  present  in  only  a  few  copies  per  cell. 
Since  these  trace  chemicals  may  effect  important  changes 
in  the  cell’s  behavior,  they  cannot  be  ignored,  yet  they 
obviously  cannot  be  treated  as  continuous  concentrations. 


This  consideration  has  given  rise  to  a  relatively  new  branch 
of  computational  research  in  chemistry:  stochastic  modeling 
of  chemical  reactions  at  the  mesoscopic  level. 

For  a  discrete  population  models  with  N  chemical  species, 
the  configuration  of  the  system  at  any  time  is  specified  by 
the  population  of  each  species:  x  :=  [  ^2  ■  •  ■  ]^- 

The  configuration  space  of  the  system  is  the  N  dimensional 
non-negative  integer  lattice  that  contains  all  possible  config¬ 
urations.  Over  time,  the  reacting  system  follows  a  trajectory, 
where  each  individual  reaction  is  represented  by  a  jump  from 
the  current  configuration  to  another  within  the  configuration 
space.  With  the  assumptions  that  the  system  is  well  mixed, 
has  constant  volume  and  is  kept  at  constant  temperature, 
the  problem  becomes  autonomous  of  time,  and  the  waiting 
time  between  jumps  can  be  modeled  as  an  exponentially 
distributed  random  variable  [1].  The  probability  of  each  con¬ 
figuration  can  then  be  described  with  the  linear  time  invariant 
ordinary  differential  equation  known  as  the  Chemical  Master 
Equation  [2]. 

Until  recently,  it  was  believed  that  the  CME  was  in¬ 
tractable  except  in  the  simplest  of  circumstances.  As  a  result, 
most  discrete  state  stochastic  models  have  been  studied 
almost  exclusively  with  the  simple  Monte  Carlo  simulation 
technique  known  as  the  stochastic  simulation  algorithm  SSA 
[3].  Here  random  numbers  are  generated  for  every  individual 
reaction  event  in  order  to  determine  (/)  when  the  next  reaction 
will  occur,  and  (if)  which  reaction  it  will  be.  However,  for 
most  systems,  huge  numbers  of  individual  reactions  may 
occur,  and  the  SSA  is  often  too  computationally  expensive. 
Two  types  of  recent  approximations  have  been  made  to 
improve  efficiency  of  the  SSA:  time  scale  based  system 
partitioning  routines  and  r  leaping  routines.  In  the  first  type, 
efficiency  is  gained  by  separating  the  system  dynamics  into 
slow  and  fast  parts  [4]-[8].  If  one  is  interested  in  short  time 
scales,  the  slow  processes  are  considered  to  be  constant,  and 
only  the  dynamics  of  the  fast  partition  are  considered.  If  one 
is  interested  in  long  term  behaviors  of  the  system,  then  the 
fast  dynamics  are  averaged  in  order  to  concentrate  upon  the 
slow  dynamics.  In  the  second  type  of  approximation  to  the 
SSA,  some  probabilistic  aspects  of  the  system  are  assumed  to 
be  constant  over  short  discrete  time  intervals,  and  under  this 
assumption  one  can  develop  a  faster  Monte  Carlo  algorithm 
that  leaps  from  one  time  step  to  the  next  [9]-[14].  Both 
approximation  types,  system  partitioning  and  t  leaping,  have 
been  very  successful  in  increasing  the  scope  of  problems  to 
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which  the  SSA  may  be  reasonably  applied. 

Moving  in  an  entirely  different  direction,  we  recently 
introduced  the  Finite  State  Projection  (FSP)  method  as  a  new 
approach  for  directly  approximating  the  solution  to  the  CME 
[15].  The  FSP  provides  a  systematic  method  to  perform  bulk 
system  reductions  on  the  possibly  infinite  dimensional  ODE 
in  order  to  find  a  finite  dimensional  analytical  approximation 
to  the  CME.  Unlike  Monte  Carlo  algorithms,  the  ESP  does 
not  rely  on  random  number  generation  and  provides  precise 
lower  and  upper  bounds  on  its  approximation  accuracy. 
While  the  earliest  implementations  of  the  PSP  have  shown 
much  promise  in  dealing  with  a  handful  of  biologically 
inspired  models  [15],  [16],  the  method  is  still  limited  in 
problems  where  accuracy  requirements  demand  considera¬ 
tion  of  an  exorbitantly  large  state  space.  In  these  cases, 
model  reduction  techniques  can  be  utilized  to  make  the 
problem  dimension  more  manageable.  Por  example,  drawing 
upon  concepts  of  reachability  and  controlability  from  modern 
control  theory,  one  can  significantly  reduce  the  order  of  the 
PSP  [17].  Additionally,  since  the  SSA  benefits  so  strongly 
from  system  partitioning  methods,  it  is  natural  to  ask  whether 
the  same  or  similar  tools  may  improve  upon  the  PSP.  In  this 
paper,  we  show  how  the  PSP  can  reap  huge  benefits  from 
time-scale  separation  based  system  partitioning  approaches 
and  the  application  of  linear  perturbation  theory  [18].  This 
times-scale  separation  is  exploited  to  develop  the  Slow- 
Manifold  PSP  algorithm,  which  we  present  here. 

In  the  next  section  we  provide  some  background  on  the 
basic  PSP  method.  In  section  III  we  illustrate  how  linear 
perturbation  theory  can  be  used  to  reduce  the  order  of  the 
PSP  problem  and  we  formulate  the  Slow-Manifold  PSP 
algorithm.  In  section  IV  we  demonstrate  this  algorithm  on 
an  example  from  the  field  of  systems  biology.  Pinally,  in 
section  V,  we  conclude  with  remarks  on  the  advantages  of 
these  approaches  over  the  original  PSP  and  outline  a  few 
directions  for  future  work  on  the  topic. 

II.  Background 

We  consider  a  well-mixed,  fixed-temperature,  and  fixed 
volume  system  of  N  distinct  reacting  chemical  species. 
At  any  point  in  time,  we  use  an  integer  vector  x  := 
[Cl  C2  •  •  •  Cm  to  describe  the  discrete  populations 
of  the  N  molecular  species.  Each  x  is  a  unique  configuration 
of  the  system,  and  the  nonnegative  set  is  the  set  of 
all  possible  configurations.  We  will  a  priori  fix  a  sequence 
Xi,X2, ...  of  elements  in  and  define  X  :=  [xi,X2, ... 
as  the  ordered  configuration  set.  Beginning  at  any  x^,  suppose 
that  there  can  be  a  maximum  of  M  possible  reactions, 
where  each  reaction  leads  to  a  different  configuration:  x^  — > 
Xj  =  Hi  +  v^.  Por  each  reaction,  is  the  associated 
stoichiometry  (directional  transition  on  the  configuration  set), 
and  is  the  associated  propensity  function.  Define 

Pi{t)  =  as  the  probability  that  the  system  will  have 

the  configuration  at  time  t.  At  an  incrementally  small  time 
later,  pfit  +  dt)  is  equal  to  the  sum  of  (i)  the  probability  that 
the  system  begins  in  x^  at  t  and  remains  there  until  t  +  dt, 
and  (ii)  the  probability  that  the  system  is  in  x^  7^  x^  at  f  and 


will  transition  to  x^  during  the  time  step,  dt.  This  probability 
can  be  written  as: 

Pi{t  +  dt)  =  pi{t)  -  X]  o,f,(x,)d^ 

M 

T  ^  [  pj^i  ^fi)dt.  (1) 

Prom  Eqn  1  it  is  easy  to  derive  the  differential  equation 
known  as  the  Chemical  Master  Equation,  or  CME  [3]: 

M  M 

Pi  (t)  —  Pi  (t)  ^  (^i)  ^  [  P(^i  ?  t)(lfi  (Hi  ^^i)  • 

fi—1  M— t 

(2) 

By  combining  all  possible  reactions  that  begin  or  end  with 
the  configuration  point,  x^,  the  time  derivative  of  the  proba¬ 
bility  density  of  x^  can  be  written  in  vector  form  as: 


T 

ai(xi  -  vi) 

p(Hi  -Ui),t) 

Pi(i)  = 

02  (Xi  —  V2) 

p(Hi  -  1^2),  t) 

1 

1 

_ 1 

_  p(Hi  -  VM),t) 

Based  upon  our  enumeration  of  X,  we  can  combine  the 
equations  for  every  Pi(t)  into  ordinary  differential  equation 
(ODE): 

P(f)=AP(f),  (4) 

where  P(f)  is  the  probability  distribution  at  time  t.  The 
generator  matrix  A  is  uniquely  defined  by  the  reaction  stoi¬ 
chiometries  and  propensities  and  the  choice  of  the  enumera¬ 
tion  of  X.  A  has  the  properties  that  it  is  independent  of  f;  all 
of  its  diagonal  elements  are  non-positive;  all  its  off-diagonal 
elements  are  non-negative;  and  all  its  columns  sum  to  zero. 
The  solution  to  Eqn  4  beginning  at  f  =  0  and  ending  att  =  tf 
is  the  expression:  P(f/)  =  ^(0,tf)  ■  P(0).  In  the  case  where 
the  configuration  set  is  finite  dimensional,  the  operator, 
^(0,tf),  is  the  exponential  of  Atf,  and  one  can  compute 
the  solution:  P(f/)  =  exp(Af7)P(0).  However,  when  A  is 
infinite  dimensional  or  extremely  large,  the  corresponding 
analytic  solution  is  unclear  or  vastly  difficult  to  compute. 
In  these  cases,  one  may  devise  a  systematic  means  of 
approximating  the  full  system  using  finite  dimensional  sub¬ 
systems.  This  truncation  approach  lies  behind  the  rationale 
for  the  Pinite  State  Projection  (PSP)  method  [15]. 

We  must  introduce  some  convenient  notation.  Pet  J  = 
{ji)  J2j is,  •  ■  •}  denote  an  index  set.  Por  any  two  index  sets  / 
and  J,  let  J  C  /  denote  that  I  contains  every  element  from 
J,  and  let  J  [J  /  and  J  P|  /  be  the  union  and  intersection, 
respectively,  of  J  and  I.  Pet  J'  denote  the  complement  of 
the  set  J,  where  unless  stated  otherwise  the  complement  is 
taken  with  respect  to  the  set  of  all  non-negative  integers.  If 
X  is  an  enumerated  set  {xi,X2,X3, . . .},  then  Xj  denotes 
the  subset  {xj^jXj^jXjj, . . .}.  Purthermore,  let  vj  denote 
the  subvector  of  v  whose  elements  are  chosen  according  to 
J,  and  let  A/j  denote  the  submatrix  of  A  such  that  the  rows 


have  been  chosen  according  to  I  and  the  columns  have  been 
chosen  according  to  J.  For  example,  if  /  and  J  are  defined 
as  {3, 1,2}  and  {1,3},  respectively,  then; 


a 

b 

c 

9 

k  ■ 

d 

e 

f 

= 

a 

c 

_  9 

h 

k  _ 

ij 

.  ^ 

/  . 

For  convenience  let  A  j  denote  the  principle  submatrix  of  A, 
in  which  both  rows  and  columns  have  been  chosen  according 
to  J.  Unless  otherwise  stated,  all  vector  inequalities  are  to  be 
interpreted  componentwise.  With  this  notation  we  can  restate 
the  following  theorem  from  [15]: 

Theorem  1  If  A  G  has  no  negative  off-diagonal 

elements,  then  for  any  index  set,  J, 

[expAjj  >  exp[Aj]  >0.  (5) 

Ref.  [15]  provides  a  detailed  proof  of  this  theorem.  The 
matrix  A  in  Eqn  4  has  no  negative  off-diagonal  terms  and 
therefore  satisfies  the  assumptions  of  Thm  1.  Consider  two 
finite  configuration  subsets  Xj^  and  Xj^,  where  J2  3  J\. 
Since  the  full  probability  density  vector,  P(0)  is  nonnegative, 
Thm  1  assures  that; 


Step  0  Choose  an  initial  finite  set  of  states,  Xj^,  for  the  FSR 
Initialize  a  counter,  i  =  Q. 

Step  1  Use  propensity  functions  and  stoichiometry  to  form  A 
Compute  Fj;  =  exp(Aj;f/)Pj;(0). 

Step  2  If  Fj.  >  1  —  e.  Stop. 

exp(Aj.fy:)Pj. (0)  approximates  to  within  e. 

Step  3  Add  more  states  to  find 

Increment  i  and  return  to  Step  1. 

For  the  basic  FSP  algorithm,  if  we  wish  to  find  a  solution 
that  is  accurate  to  within  e  at  a  time  fj,  we  must  find  a  finite 
set  of  configurations  such  that  the  probability  of  ever  leaving 
that  set  during  the  time  interval  [0,f/]  is  less  than  e.  For 
many  problems,  including  the  examples  shown  in  [15],  [16], 
this  set  of  configurations  may  be  small  enough  that  we  can 
easily  compute  a  single  matrix  exponential  to  approximate 
the  solution  to  the  CME.  However,  in  other  situations  that 
may  not  be  the  case.  In  the  next  section  we  apply  some 
concepts  of  linear  perturbation  theory  to  reduce  the  order 
of  the  ESP  problem  and  thereby  significantly  improve  the 
efficiency  and  expand  the  applicability  of  the  ESP  algorithm. 

III.  The  Slow-Manifold  PSP  Algorithm 


[e\p{Aj^tf)]j,Pj,{0)  >  exp(Aj,f/)Pjj(0)  >  0. 

where  Pj^  is  the  probability  distribution  of  the  elements 
of  X  indexed  by  Ji.  This  result  guarantees  that  as  one 
increases  the  finite  configuration  subset,  the  approximate 
solution  increases  monotonically  for  each  element  in  the 
configuration  subset. 

In  addition  to  being  non-negative,  P(f)  sums  to  exactly 
one.  These  properties  and  the  nonnegativity  of  the  off- 
diagonal  elements  of  A  allow  one  to  state  the  following  result 

[15]. 

Theorem  2  Consider  a  Markov  process  in  which  the 
probability  distribution  evolves  according  to  the  linear  ODE, 
P(f)  =  AP(f),  where  A  has  no  negative  off-diagonal 
entries.  If  for  some  finite  index  set  J,  £  >  0,  and  f/  >  0, 

l^exp(Ajf/)Pj(0)  >  1  -  £,  (6) 

then 

exp(Ajf/)Pj(0)  <  Pj(f/),  and  (7) 

l|P./(i/) -exp(Ajf/)Pj(0)||^  <  £.  (8) 


In  Step  I  of  the  PSP  algorithm  above  we  are  faced  with 
solving  an  N  dimensional  LTI  system: 

P(f)=AP(f),  (9) 

where  A  is  made  up  from  the  propensity  functions  of  the 
various  reactions.  In  gene  networks  we  can  often  identify 
m  clusters  of  configurations  within  which  transitions  occur 
quite  frequently,  while  transitions  between  the  clusters  are 
relatively  rare.  Let  H  be  a  generator  matrix  that  is  made 
up  from  the  frequent  transitions  within  these  clusters,  and 
let  G  be  the  generator  made  up  of  the  remaining  transitions 
from  one  cluster  to  another  such  that  A  =  H  -f  G.  With 
some  permutation  of  the  configuration  space,  H  has  a  block 
diagonal  structure,  H  =  diagjHi,  H2, . . . ,  H^},  where 
each  block  is  itself  a  generator  corresponding  to  a  single 
cluster  of  fast  interconnected  configurations. 

Each  Hi  is  itself  a  generator  and  therefore  has  an  eigen¬ 
value  equal  to  zero,  which  corresponds  to  a  left  eigenvector 
Ui  =  and  a  positive  right  eigenvector,  v^,  where 
is  scaled  such  that  u^Vi  =  =  1.  We  use  these 

eigenvectors  to  define. 


While  Theorem  1  guarantees  that  as  we  add  points  to 
the  finite  configuration  subset,  the  approximate  solution 
monotonically  increases.  Theorem  2  provides  a  certificate 
of  how  close  the  approximation  is  to  the  true  solution. 
Together  the  two  theorems  lead  us  to  the  PSP  algorithm 
presented  in  [15]: 


Ui  0 

’  Vi 

0  . 

0  U2  ... 

,  and  V  = 

0 

V2  • 

U  = 


Let  S  =  [  V  S 


The  Finite  State  Projection  Algorithm 

by  S-i  = 

Inputs  Propensity  functions  and  stoichiometry  for  all  reactions. 


R  \  he  a  square  matrix  in  which  each 
column  is  an  right  eigenvector  of  H  that  has  been  scaled 
to  have  an  absolute  sum  of  one.  The  inverse  of  S  is  given 

^  ^  such  that  we  have  the  following  similarity 


Initial  probability  density  vector,  P(0). 
Pinal  time  of  interest,  tf. 

Total  amount  of  acceptable  error,  £  >  0. 


transformation  for  H: 


S-^HS  = 


0 

0 


0 

A 


,  A  =  diag(Am+i,...,AAr). 


where  the  first  m  diagonal  elements  correspond  to  the  zero 
eigenvalues  of  the  blocks,  and  we  label  non-zero  eigen¬ 
values  of  H  so  that  0  >  Re{Am+i}  >  Re{Am+2}j--  -  > 
Re  {A  at}.  If  we  apply  the  coordinate  transformation 


’  yiW 
.  y2{t) 


s-ip(f) 


up(f)  ■ 

SlPW  J  ’ 


Eqn  9  becomes: 


■  yi(f)  1  _  r  UGV  VGSn  1  f  yi(f)  ' 

y2(f)  SlGV  A+SlGSr  y2(f)  ' 

(10) 

If  we  can  make  a  clear  distinction  between  frequent  and 
rare  reactions,  then  we  are  assured  that  the  magnitude  of 
GV  is  small  with  respect  to  A.  For  convenience  we  define 
^  “  |Re{A^  ^i}l  ||GV||j^  and  Q  =  jGV,  and  we  can  make 
coordinate  substitution  (yi,  y2)  =  (y,  sz)  to  get  the  standard 
singular  perturbation  model. 


■  y(f)  1  _  r  eUQ  eUGSfl  1  f  y(f)  ' 
ez{t)  \  [  sSlQ  e(A  +  SiGS^)  J  [  z{t)  ' 

(11) 

We  can  apply  linear  perturbation  theory  to  show  that  |z(f)|  < 
7(f)  -f  0(e),  where  7  =  |z(0)|  exp(Am-i-if)  is  the  transient 
error  of  the  fast  manifold.  For  the  slow  manifold,  we  have 


y(f)  =  exp(UGVf)y(0)  -h  0{s), 
for  all  time  f  >  O'. 

The  reverse  transformation  then  yields 

p  _  q  r  yiW  1  _  q  r  yW 
[  y2(f)  J  [  ez(f)  _ 

=  Vy(f)+0(£) 

=  Vexp(UGVf)y(0)  -h  0{e) 

=  Vexp(UGVf)UP(0)  -h  0(e). 


It  is  important  to  note  that  due  to  truncation  of  Fqn  10, 
only  contributions  of  the  first  m  eigenvectors  of  H  affect 
the  approximate  solution.  Therefore,  instead  of  calculating 
full  eigensystems  for  each  block  H^,  it  suffices  to  find  only 
eigenvectors  associated  with  zero  eigenvalues. 

Applying  this  model  reduction  to  the  original  FSP 
algorithm  yields  the  following  algorithm  which  we  name 
the  Slow-Manifold  FSP  algorithm: 


The  Slow-Manifold  FSP  Algorithm 

Inputs  Propensities  and  stoichiometries  for  all  reactions. 
Initial  probability  density  vector,  P(0). 

Final  time  of  interest,  tf. 

Target  FSP  error,  J  >  0. 

Step  0  Choose  initial  set  of  states,  Xj^,  for  the  FSP. 
Initialize  a  counter,  k  =  0. 

Step  1  Use  fast  reactions  connecting  states  within  Xj^,  to 
form  =  diagjHi, . . . ,  H^}. 

Use  remaining  reactions  to  form  Gj^. 

Step  2  Find  eigenvalues  and  vectors  of  each  Hi  and 

’For  a  detailed  derivation  of  these  errors,  see  the  appendix  of  [18]. 


build  matrices  U  and  V. 

Estimate  e  =  ||Gj^V||^  /  |Am-ri|- 
Compute  transient  error  7  =  |SaP(0)|]^  exp(Am  +if/)- 
Step  3  Find  =  Vexp(UGj,Vf/)UPj,(0) 

and  compute  Fj,  = 

Step  4  If  Fj^  >  1  —  1),  Stop. 

approximates  Pjf.(tf)  within  (5  -I-  7  +  0{s). 
Step  5  Add  more  states  to  find  Xj^.^^. 

Increment  k  and  return  to  Step  1. 

Above  we  have  used  the  non-traditional  error  estimate 
notation  <5  -I-  7  +  0{e)  to  mean  the  following.  If  6  is  largest, 
then  the  dominant  error  is  most  likely  the  result  of  the 
projection,  and  the  slow  manifold  truncation  error  can  be 
ignored.  If  7  is  largest  then  the  time  f/  is  too  short  for 
the  transient  dynamics  to  sufficiently  diminish  and  additional 
eigenvectors  must  be  included  in  the  truncation.  Finally,  if 
e  is  larger  than  S  and  7,  then  there  is  insufficient  separation 
between  the  slow  and  fast  dynamics  and  an  alternative 
reduction  scheme  may  be  required. 

In  the  following  section  we  will  illustrate  this  algorithm 
on  model  of  the  heat  shock  response  in  E.  coli. 

IV.  Example 

In  order  to  deal  with  frequently  changing  situations, 
living  organisms  require  mechanisms  to  deal  with  harsh 
environments.  One  such  mechanism  is  the  cellular  heat  shock 
response.  At  elevated  temperatures  proteins  inside  the  cell 
begin  to  misfold  and  lose  their  geometry  dependent  function¬ 
ality.  In  an  attempt  to  stop  this  misfolding,  the  cell  produces 
heat  shock  proteins  including  molecular  chaperones  and 
proteases.  By  helping  to  refold  deformed  proteins,  molecular 
chaperones  can  restore  the  original  functionality  of  those 
proteins.  Proteases,  on  the  other  hand,  help  degrade  damaged 
proteins  before  those  damaged  proteins  impair  the  function 
of  the  cell.  The  core  of  the  heat  shock  response  mechanism 
in  E.  coli  is  the  synthesis  of  the  (T32-RNAP  complex  [19]. 
Here  we  study  a  simplified  model  for  (J32-RNAP  synthesis 
using  the  slow  manifold  finite  state  projection  algorithm. 

At  normal  physiological  temperatures  1732  protein  is  found 
almost  exclusively  in  a  complex  1732 -DnaK.  However,  as  the 
temperature  increases  this  complex  becomes  less  stable  and 
the  probability  of  finding  free  CT32  inside  the  cell  increases. 
Once  free,  (T32  can  combine  with  RNA  polymerase  to  form 
a  stable  1732-RNAP  complex,  which  initiates  production  of 
heat  shock  proteins.  This  simple  regulatory  mechanism  is 
summarized  by  a  set  of  three  reactions. 

Si  ^  S2  ^  S3,  (13) 

where  Si,  S2  and  S3  correspond  to  the  (732-DnaK  complex, 
the  1732  heat  shock  regulator  and  the  (732-RNAP  complex, 
respectively.  This  model  of  the  heat  shock  subsystem  has 
been  analyzed  before  using  various  computational  methods 
including  Monte  Carlo  implementations  [7],  [20]. 

In  the  biological  system,  the  relative  rates  of  the  reactions 
are  such  that  the  reaction  from  S2  to  si  is  by  far  the  fastest. 
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Fig.  1.  (a)  Two  dimensional  integer  lattice  representing  possible  configu¬ 

rations  of  the  toy  heat  shock  model.  Here  52  and  53  are  populations  of  free 
(J32  molecules  and  cr32-RNAP  compounds,  respectively,  while  si  is  the 
population  of  (T32-DnaK  compounds.  Reactions  5i  ^  52,  are  represented 
by  bidirectional  horizontal  arrows  and  reactions  S2  — ^  53  is  represented 
with  diagonal  arrows.  The  total  number  of  (J32  is  constant,  so  the  chemical 
state  of  the  system  is  uniquely  defined  by  52  and  53  alone,  (b)  The  same 
lattice  after  applying  the  finite  state  projection.  Unlikely  states  have  been 
aggregated  into  a  single  sink  state.  Each  horizontal  row  of  configurations  is 
separated  from  the  rest  by  the  slow  reaction  3  and  the  is  used  to  form  the 
fast  block  generator  Hj 


and  (J32  molecules  infrequently  escape  from  DnaK  long 
enough  to  form  the  1732 -RNAP  complex.  The  purpose  of 
this  mechanism  is  to  strike  a  balance  between  fixing  the 
damage  produced  by  heat  and  saving  the  cell’s  resources, 
as  a  significant  portion  of  cell  energy  is  consumed  when 
producing  heat  shock  proteins.  The  optimal  response  to  the 
heat  shock  is  not  massive,  but  measured  production  of  heat 
shock  proteins,  which  leaves  sufficient  resources  for  other 
cellular  functions.  We  use  the  following  set  of  parameters 
values  for  the  reaction  rates: 

Cl  =  10,  C2  =  4  X  10"^,  C3  =  2, 

si(0)  =  2000,  S2(0)  =  53(0)  =  0. 

With  only  the  reactions  above,  the  total  number  of  1732- 
free  or  in  compounds-is  constant,  so  that  Si  +  S2  +  S3  = 
const.  With  this  constraint  the  reachable  states  of  this  three 
species  problem  can  be  represented  on  a  two  dimensional 
lattice  as  shown  in  Figure  1  a.  For  our  initial  conditions  there 
are  2,001,000  reachable  states,  and  the  full  chemical  master 
equation  is  too  large  to  be  tackled  directly,  and  we  wish  to 
find  an  approximate  solution  using  the  slow  manifold  FSP 
algorithm  at  time  tf  =  300s.  To  do  this  we  follow  the  steps 
of  the  algorithm  as  follows: 


Step  0:  We  specify  the  problem  parameters  as  given  above. 
We  choose  a  target  1-norm  error  of  <5  =  0.001  for  the 
distribution  at  time  tf,  and  we  choose  an  initial  FSP  set 
Xjg  that  includes  all  configurations  such  that  S3  <  200  and 
S2  <  11. 

Step  1:  We  form  the  fast  generator  Hj^,.  For  the  config¬ 
urations  in  Xjq,  reaction  1  has  propensities  ranging  from 
18,000  to  20,000,  reaction  2  has  propensities  ranging  up 
to  110,000,  and  reaction  3  has  a  propensity  that  ranges  no 
higher  than  22.  Thus  there  is  a  clear  separation  between  the 
fast  transitions  (reactions  1  and  2)  and  the  slow  transitions 
(reaction  3).  Furthermore,  since  S3  changes  only  during  the 
slow  reaction,  each  level  of  S3  =  i  defines  a  cluster  of  fast 
interconnected  configurations;  these  are  shown  as  horizontal 
rows  in  Figure  1.  We  will  use  the  index  set  /j  to  denote  the 
set  of  configurations  in  the  cluster.  The  generator  is 
formed  from  all  of  the  reactions  that  begin  and  end  within 
X/. .  In  this  case  there  are  12  points  in  each  cluster  corre¬ 
sponding  to  S3  =  i,  and  S2  =  {0, 1,2,...,  11};  if  we  were  to 
change  the  number  of  points  in  our  projection,  the  size  and 
number  of  clusters  would  change  accordingly.  The  full  fast 
generator  is  defined  Hj  =  diag(H7(,,  H/^ , . . . ,  For 

the  reaction  rates  given  above,  the  first  nonzero  eigenvalue 
of  Hj  can  be  computed  to  be  Xm+i  ~  ~4  x  10^. 

The  rare  transition  generator  Gj  is  then  formed  from 
the  remaining  transitions  coming  from  two  components:  (/) 
reaction  3  corresponding  to  a  transition  from  set  X/^  to 
X/.^j^,  and  (ii)  all  transitions  taking  the  system  from  the 
set  Xj  to  its  complement  Xj/. 

Step  2:  We  build  the  matrices  U  and  V  from  the  left  and 
right  eigenvectors  corresponding  to  the  zero  eigenvalues  of 
Hj.  We  compute  the  norm  ||GjV||j^  «  2.0,  and  use  this  to 
estimate  how  well  the  time  scale  is  separated  between  the 
frequent  and  rare  events: 


£  = 


G./VII1 

l'^m-t-1 1 


2 

4  X  104 


=  0.5  X  10"^ 


We  also  compute  the  transient  error  due  to  the  initial  con¬ 
ditions,  7  =  |S7,P(0)|^ exp(Am_|_if/)  «  exp(— 1.2  x  10^), 
which  is  far  smaller  that  6,  and  can  therefore  be  neglected. 

Step  3:  We  apply  perturbation  theory  and  approximate  the 
FSP  solution  as 


P5’r(^/)  «  Vexp(UGj,Vf/)UPj,(0), 

which  has  error  7  -f  0(e)  compared  to  the  unreduced  FSP 
solution.  In  order  to  determine  whether  this  FSP  solution  is 
sufficient,  we  compute  Fj^  =  =  7.3  x  10“®. 

It  is  not. 

Step  4:  Since  the  Fjg  <  1  —  5,  we  need  to  add  more 
configurations  and  return  to  Step  1.  If  we  define  Xjj^  to 
include  all  configurations  such  that  S3  <  250  and  S2  < 
11,  we  find  Fj^  =  0.034,  which  is  little  better.  If  we 
further  expand  the  configuration  set  to  include  up  to  300 
or  350  molecules  of  S3,  we  compute  Fj^  =  0.921  and 
Fjj  =  0.99997  respectively.  The  FSP  solution  on  the  set 
Xjj  exceeds  the  precision  of  our  stopping  criteria,  and  we 
know  that  approximates  the  true  solution  within 


Fig.  2.  Probability  distribution  for  S3  calculated  at  tf  =  300s.  The  slow 
manifold  FSP  solution  (dots)  approximates  well  the  FSP  solution  (smooth 
solid  lines).  The  jagged  grey  line  represents  the  corresponding  slow  scale 
SSA  solution  after  10“^  realizations.  See  also  Table  I. 

S  =  0.001.  Furthermore,  since  e  =  0.0005  and  7  <C  <5,  the 
error  introduced  by  the  time  scale  separation  is  of  the  same 
order. 

For  this  problem  we  are  interested  in  determining  how  the 
population  of  0-32 -RNAP  compounds  grows  in  time  if  the 
temperature  is  constant  and  slightly  above  normal  physiolog¬ 
ical  level.  This  number  is  proportional  to  the  number  of  heat 
shock  proteins  produced  in  the  cell.  With  the  original  FSP 
algorithm,  we  have  computed  the  probability  distribution  for 
S3  at  fy  =  300s,  and  Figure  2  illustrates  this  probability 
distribution.  The  FSP  solution  is  shown  with  solid  lines,  and 
the  dots  represent  the  distribution  of  S3  as  computed  using 
our  current  slow  manifold  FSP  algorithm.  The  difference 
between  the  two  results  is  indistinguishable.  However,  if  we 
are  to  compare  the  relative  computational  effort  of  the  two 
algorithms,  we  hnd  a  signihcant  improvement.  Table  I(top) 
shows  the  computational  time  and  accuracy  for  each  step 
of  the  FSP  and  the  slow  manifold  FSP  algorithm.  From 
the  table  we  hnd  that  the  time  scale  reduction  reduces  the 
computational  effort  at  each  step  by  a  factor  of  almost 
1000  compared  to  the  original  FSP  algorithm.  For  further 
comparison  I(bottom)  shows  the  total  computational  effort 
of  the  FSP  and  the  slow  manifold  FSP,  as  well  as  two  Monte 
Carlo  methods:  the  original  SSA  and  an  SSA  approximation 
very  similar  to  the  Slow  Scale  SSA  [7]  in  which  we  have 
applied  the  same  time  scale  reduction  approach.  The  1-norm 
difference  between  the  original  FSP  and  the  slow  manifold 
FSP  was  found  to  be  6.6  x  10“^,  which  is  indeed  on  the 
same  order  as  e  =  5  x  10“^  and  is  smaller  than  the  required 
error  tolerance  5  =  1  x  10“^.  However,  by  making  this  small 
sacrihce  in  accuracy,  the  slow  manifold  FSP  converges  in  a 
fraction  of  the  time  of  any  of  the  other  methods. 

V.  Conclusion 

Until  recently,  it  was  thought  that  the  chemical  master 
equation  could  not  be  solved  analytically  except  for  the 
most  trivial  systems.  Previous  work  on  the  Finite  State 
Projection  demonstrated  that  for  many  biological  systems. 


TABLE  I 

(Top)  a  comparison  of  the  computational  effort  and 
ACCURACY  FOR  THE  FSP  AND  THE  SLOW-MANIFOLD  FSP 
ALGORITHMS  FOR  EACH  EXPANDING  CONFIGURATION  SET 

Xjj^  C  Xj2  c  Xjg  .  (Bottom)  A  comparison  of  the  total 

COMPUTATIONAL  EFFORT  AND  ACCURACY  OF  THE  FSP,  THE 

Slow-Manifold  FSP,  the  SSA  and  a  Slow  Scale  SSA  algorithm 
FOR  THE  solution  OF  THE  CME  ARISING  IN  THE  TOY  HEAT  SHOCK 
MODEL  AT  tf  =  300s.  All  computations  HAVE  BEEN  PERFORMED  IN 
Matlab  7.2  ON  A  2.0  MHz  PowerPC  Dual  G5. 


Configuration  set 

||Error||j 

Comp.  Time  (s) 

FSP 

Slow-Manifold  FSP 

Xjj:  S3  <  250,  S2  <  11 

0.97 

253 

0.39 

Xj^:  S3  <  300,62  <  11 

0.08 

439 

0.54 

Xj,:  S3  <  350,  S2  <  11 

2  X  lO”'^ 

647 

0.67 

Method 

#  Simulations 

Time  (s) 

IIErrorll, 

FSP 

_  a 

1472 

<  2  X  lO”'’ 

SSA 

10^ 

>  20000 

Ri  0.25 

Slow-Manifold  FSP 

- 

1.88 

R!  6.6  X  lO-'^ 

Slow  Scale  SSA 

10^ 

82 

R  0.24 

Slow  Scale  SSA 

10^ 

826 

R  0.066 

Slow  Scale  SSA 

10^ 

8130 

R  0.027 

"The  FSP  and  Slow-Manifold  FSP  algorithms  are  run  only  once 
with  a  specified  allowable  total  en'or  of  10“^. 


bulk  system  reductions  could  bring  models  closer  into  the 
fold  of  solvable  problems.  Here  we  have  shown  that  the 
Finite  State  Projection  method  can  be  further  enhanced  when 
solving  the  chemical  master  equation  for  systems  involving 
multiple  time  scales.  In  this  work  we  have  presented  a 
Slow  Manifold  version  of  the  FSP  algorithm.  Based  upon 
singular  perturbation  theory  and  our  previous  work  in  [18], 
this  algorithm  provides  a  powerful  computational  tool  for 
studying  intracelular  processes  and  gene  regulatory  networks. 

By  casting  the  stochastic  chemical  kinetics  problem  in  a 
familiar  linear  system  context,  the  FSP  provides  valuable 
insight  into  the  bewildering  complexity  that  intracellular 
processes  exhibit.  Model  reductions  not  only  speed  up 
computations,  but  they  also  enable  us  to  distinguish  which 
aspects  of  a  system  are  important  and  which  are  not.  It  is 
plausible  that  the  main  features  of  intracellular  dynamics 
can  be  captured  in  a  relatively  small  subset  of  the  state 
space,  as  the  results  obtained  by  FSP  reduction  on  the  heat 
shock  model  suggest.  Depending  on  the  observation  time  of 
interest,  some  reactions  can  be  neglected,  while  some  will 
contribute  only  through  their  averages.  Preliminary  success 
with  our  approach  gives  us  a  hope  that  relatively  simple 
models  for  intracellular  processes  can  be  discovered  in  the 
process  of  reducing  the  FSP  solution. 

Of  course,  one  can  easily  envision  that  additional  model 
reductions  may  be  possible  to  even  further  enhance  the  power 
of  the  Finite  State  Projection.  Indeed  other  reductions  based 
upon  control  theory  [17]  are  already  becoming  apparent. 
Also,  in  our  computations  we  have  used  off  the  shelf  nu¬ 
merical  routines  for  eigensystem  calculations  and  matrix  ex¬ 
ponentiation.  Further  improvements  in  computational  speed 


can  be  achieved  if  these  routines  are  optimized  for  matrices 
which  define  master  equations  and  their  special  properties. 
We  intend  to  investigate  these  possibilities  in  the  future. 
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